ABSTRACT

This  paper  is  concerned  with  user  aspects  of  proof  in  the  specification  language  Z.  A 
number  of  techniques  for  proving  theorems  in  Z  are  presented.  Some  of  the 
techniques  are  not  new,  being  drawn  from  the  area  of  general  mathematical  theorem 
proving,  but  are  brought  together  in  one  place  and  applied  specifically  to  Z.  The 
techniques  have  been  written  so  that  they  can  be  understood  and  applied  by  a  person 
carrying  out  a  pen  and  paper  proof.  Some  of  the  techniques  may  be  automated,  and 
would  result  in  a  proof  tool  tailored  to  the  needs  of  the  user;  currently  a  user  has  to 
tailor a proof to fit a tool.

1 Introduction


The specification language Z has been developed to a stage where it is suitable for
writing large specifications. The Z reference manual [Spivey] gives a good working
definition and the language has been submitted for standardization. There are many
examples of its use, the best known being the set of case studies in [Hayes]. There are
many introductory texts, while [Gravell] and [Macdonald] present techniques for writing
clearer specifications, so that they are easier to understand. There are also a number of
editors, syntax checkers and type checkers. However, an important process in understanding
and validating a specification is to reason about it. There are no guidelines on how a proof
should be conducted or presented. In particular there is no standard syntax for theorems
in Z. The only published syntax for Z which contains a notation for theorems is [King].
There is also a lack of tools and techniques for carrying out proofs.

This paper presents a number of useful techniques for proving theorems in Z. The paper
does not contain a logic for Z (for this see [Woodcock]), but rather a collection of
heuristics. Some of the techniques are not new, being drawn from the area of general
mathematical theorem proving. But it is useful to see these techniques brought together in
one place and applied specifically to Z. The techniques have been written so that they can
be understood and applied by a person carrying out a pen and paper proof. However, they
could also be automated, resulting in a tool tailored to the needs of the user; currently all
too often a user has to tailor a proof to fit a tool [Smith b].

The content of the paper is as follows. Section 2 contains a useful proof technique known
as generalization. This is where a theorem is strengthened, roughly keeping its original
structure, so that the new theorem is more useful and, surprisingly, sometimes easier to
prove. The original theorem follows as a special case of the generalized theorem. Section
3 contains techniques for proof by induction. One technique shows how to choose the
right induction schema and induction variable, while another shows how to finish the
proof once the induction step has been performed. Section 4 explains a style of reasoning
known as window inference. This is a technique where a user can transform an expression
by restricting attention, or windowing, on a subexpression. In this section, window
inference is formalized so that it can be automated. Section 5 presents a technique to help
check the consistency of a Z specification. The technique is for proving the existence of a
recursive function defined over a recursive free type. The technique extends that in
[Smith a] so that a wider class of functions is covered. Section 6 discusses reasoning at
the schema level, and presents some useful laws for calculating preconditions. Section 7
discusses the conflict between specification and proof. Sometimes a theorem is easier to
prove from an alternative specification. But this alternative specification can be harder to
understand. Finally, section 8 contains the conclusions of the paper, and suggestions for
further work.


2  Generalization 

Sometimes it is easier to prove a theorem by generalizing it. The original theorem then
follows as a special case of the generalized theorem. At first sight this seems odd. How
can a more powerful theorem be easier to prove? A reason is that it removes irrelevant
detail from the problem. This section contains two techniques for generalization: the first
is to replace a subterm in a theorem by a variable; the second is to use higher order
functions. A generalized theorem is also more useful, because it can be applied to more
cases. It can therefore not only be used to obtain the original theorem, but other theorems
as well. Since a generalized theorem is more useful and sometimes easier to prove,
generalization is of interest to a person carrying out a pen and paper proof. It is also of
interest to a tool builder because the first technique, and part of the second, can be
automated.

2.1 Replacing a Subterm by a Variable

The  first  technique  for  generalizing  a  theorem  is  to  replace  every  occurrence  of  a 
particular  subterm  by  a  variable,  thus  removing  unnecessary  clutter.  The  technique  is 
described  in  [Bird]  and  [Brumfitt],  and  is  illustrated  in  the  next  example.  All  examples  in 
this  paper  start  with  the  word  Example  and  end  with  the  symbol  A. 

Example  1 

Consider the following specification

    _ + _ : (ℕ × ℕ) → ℕ

    ∀ m, n : ℕ •
        0 + m = m
        s(n) + m = s(n + m)

    sum : ℕ → ℕ

    ∀ n : ℕ •
        sum 0 = 0
        sum(s n) = s(n) + sum(n)

The function + is addition over ℕ, s is the successor function, and the expression sum n
is the sum of the first n natural numbers. Now consider the theorem

⊢ ∀ n : ℕ • sum(n) + 0 = sum(n)    (A)

This theorem can be proved by induction over ℕ, using the above axioms. But during the
proof, a lemma (the associativity of +) is also required. Again this would have to be
proved using induction. So the total effort in proving theorem A is two induction proofs.
Now it is easier to prove a generalization of theorem A, namely

⊢ ∀ x : ℕ • x + 0 = x    (B)

Notice how the subterm sum(n) in theorem A (which appears in two places) has been
replaced by a simple variable x. This has removed the function sum, which is just clutter
in the original theorem. Once theorem B has been proved, theorem A follows immediately
by specializing x with sum(n). The proof of B is again by induction over ℕ but does not
require a lemma. Not only is theorem B easier to prove, it is more useful. It can be used to
prove the original theorem A without induction, by specializing x with sum(n). Similarly
it can be used to prove the theorem

⊢ ∀ n : ℕ • n! + 0 = n!

where ! is the factorial function, by specializing x with n!. A
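The specialization step in Example 1 can be tried out executably. The following is a minimal sketch, assuming the natural numbers are modelled by non-negative Python integers; the names s, add, total, theorem_a and theorem_b are illustrative and do not come from the specification. Checking finitely many cases is of course not a proof, it only shows that theorem A is literally an instance of theorem B.

```python
def s(n: int) -> int:
    """Successor function (assumed model: Python ints for the naturals)."""
    return n + 1


def add(n: int, m: int) -> int:
    """Addition by recursion on the left argument:
    0 + m = m and s(n) + m = s(n + m)."""
    return m if n == 0 else s(add(n - 1, m))


def total(n: int) -> int:
    """The function sum of Example 1:
    sum 0 = 0 and sum(s n) = s(n) + sum(n)."""
    return 0 if n == 0 else add(n, total(n - 1))


def theorem_b(x: int) -> bool:
    """Generalized theorem B: x + 0 = x."""
    return add(x, 0) == x


def theorem_a(n: int) -> bool:
    """Theorem A is theorem B with x specialized to sum(n)."""
    return theorem_b(total(n))
```

Note that theorem_a contains no reasoning of its own: it only instantiates theorem_b, which mirrors the way theorem A follows from theorem B without a second induction.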

Another  case  when  a  more  general  theorem  can  be  easier  to  prove  is  in  proof  by 
induction.  Although  generalizing  might  strengthen  the  induction  conclusion,  it  also 
strengthens  the  induction  hypothesis.  This  stronger  hypothesis  can  help  in  proving  the 
theorem.  The  next  example  illustrates  this,  using  the  same  technique  of  replacing  a 
subterm with a variable.

Example 2

Consider the following quick reverse function for sequences

    [X]
    qrev : (seq X × seq X) → seq X

    ∀ x : X; s, t : seq X •
        qrev(⟨⟩, t) = t
        qrev(⟨x⟩ ⌢ s, t) = qrev(s, ⟨x⟩ ⌢ t)

Consider the following theorem

⊢ ∀ s : seq X • rev s = qrev(s, ⟨⟩)    (A)

where rev is the ordinary reverse function as defined in [Spivey]. Proving this theorem
by induction (over sequences) on s does not work. The problem is in the step case, where
the induction hypothesis is not strong enough. But proving a generalization of the
theorem is successful. The generalization is to notice that theorem A can be written

⊢ ∀ s : seq X • (rev s) ⌢ ⟨⟩ = qrev(s, ⟨⟩)

and then generalized to

⊢ ∀ s, t : seq X • (rev s) ⌢ t = qrev(s, t)    (B)

where the subexpression ⟨⟩ has been replaced by the variable t. Proving B by induction
on s now gives a stronger induction hypothesis, sufficient to prove the theorem. A
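The two theorems of Example 2 can be compared on concrete data. This is a sketch only, assuming sequences are modelled by Python lists and that rev has the usual recursive definition (the text refers to [Spivey] for it); the helper names theorem_a and theorem_b are illustrative.

```python
from typing import List


def rev(s: List[int]) -> List[int]:
    """Ordinary reverse (assumed definition): the reverse of an empty
    sequence is empty, otherwise reverse the tail and append the head."""
    return [] if not s else rev(s[1:]) + [s[0]]


def qrev(s: List[int], t: List[int]) -> List[int]:
    """Quick reverse: qrev(<>, t) = t and
    qrev(<x> ^ s, t) = qrev(s, <x> ^ t)."""
    return t if not s else qrev(s[1:], [s[0]] + t)


def theorem_a(s: List[int]) -> bool:
    """Theorem A: rev s = qrev(s, <>)."""
    return rev(s) == qrev(s, [])


def theorem_b(s: List[int], t: List[int]) -> bool:
    """Generalized theorem B: (rev s) ^ t = qrev(s, t)."""
    return rev(s) + t == qrev(s, t)
```

The accumulator t changes at every recursive call of qrev, which is why an induction hypothesis fixed at the empty sequence is too weak and the version quantified over t is needed.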


2.1.1  Strengths  and  Weaknesses 

The technique of replacing a subterm with a variable could be automated, and is therefore
also of interest to a tool builder. Indeed, the technique has been automated in the
Boyer-Moore theorem prover [Boyer]. Unfortunately, if the subterm is chosen arbitrarily,
the generalized version might not be a theorem. In order to avoid this, an understanding
of the problem is necessary. The technique is still safe, because the user would not be
able to prove the generalized version. The next example illustrates this.

Example  3 

Consider the theorem

⊢ ∀ x : X • rev ⟨x⟩ = ⟨x⟩

Replacing the subterm ⟨x⟩ by a variable y gives

⊢ ∀ y : seq X • rev y = y

which is obviously nonsense. A
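A concrete counterexample makes the failure in Example 3 visible. The sketch below assumes sequences are modelled by Python lists and uses slicing for rev; the variable names are illustrative.

```python
def rev(s):
    """Reverse of a sequence, modelled here by list slicing."""
    return s[::-1]


# The original theorem holds for every singleton sequence that is tried.
singleton_cases_hold = all(rev([x]) == [x] for x in range(10))

# The over-generalized statement rev y = y fails for a two-element sequence.
counterexample = [1, 2]
generalized_fails = rev(counterexample) != counterexample
```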


2.2 Using Higher Order Functions

A  second  technique  for  generalizing  theorems  is  to  use  higher  order  functions.  This 
technique  is  described  in  [Bird]  and  [Brumfitt].  In  this  section,  a  higher  order  function 
will  be  one  which  takes  a  function  as  argument,  and  gives  a  function  as  result. 


In general, a theorem contains a number of objects, for example functions, relations or
sets. These objects have a number of properties, but the theorem will only depend on
some of these properties. The theorem will also hold for other objects, provided they too
have these required properties. Higher order functions allow these required properties to
be separated from the irrelevant properties. For example, consider a theorem concerning
sequences. Suppose that the theorem only depends on properties of sequences
(collections of elements), and not on properties of individual elements of sequences. A
higher order function can be used to abstract, or lift, the properties about the collection of
elements from properties about individual elements. There is an analogy here with the
technique of promotion in Z. This is where a framing schema (analogous to a higher
order function) lifts the properties about collections of objects away from properties
about individual objects. The next example illustrates the use of higher order functions to
generalize a theorem.

Example  4 

Consider the function square_seq which squares every element of a sequence of
integers. For example, square_seq ⟨5, 1, -2⟩ = ⟨25, 1, 4⟩. Formally

    square_seq : seq ℤ → seq ℤ

    ∀ i : ℤ; s : seq ℤ •
        square_seq ⟨⟩ = ⟨⟩
        square_seq (⟨i⟩ ⌢ s) = ⟨i * i⟩ ⌢ (square_seq s)

Consider the following theorem

⊢ ∀ s, t : seq ℤ •    (A)
    square_seq (s ⌢ t) = (square_seq s) ⌢ (square_seq t)

If the function square_seq in the above theorem is replaced by the function
add_one_seq (which adds one to every element of a sequence), then the theorem still
holds. In fact, the operation on the individual elements of the sequence (for example
squaring and adding one) is irrelevant. This fact can be captured by using the higher
order function map, familiar from functional programming (see for example [Bird]), and
defined as

    map : (ℤ → ℤ) → (seq ℤ → seq ℤ)

    ∀ f : ℤ → ℤ; i : ℤ; s : seq ℤ •
        map f ⟨⟩ = ⟨⟩
        map f (⟨i⟩ ⌢ s) = ⟨f i⟩ ⌢ (map f s)


Notice the similarity between the definition of map and that of square_seq. Notice how
the particular operation on the individual elements, i * i, has been replaced by the general
operation f i. The functions square_seq and add_one_seq, for example, can be written

    square == λ i : ℤ • i * i
    add_one == λ i : ℤ • i + 1

    square_seq == map square
    add_one_seq == map add_one

The generalized version of theorem A is then obtained by replacing square_seq by map
f, and universally quantifying over f, to obtain

⊢ ∀ f : ℤ → ℤ; s, t : seq ℤ •    (B)
    map f (s ⌢ t) = (map f s) ⌢ (map f t)


Theorem B can be proved by induction on s, and then simply specialized to obtain
theorem A and similar theorems with functions other than square_seq. As an aside,
notice that the type ℤ in theorem A could also be generalized (to the generic type X say),
since the type of the elements in the sequence is also irrelevant. This would mean
defining map to be generic in X and replacing every occurrence of ℤ in theorem A by X. A
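The generalization in Example 4 can be mirrored in code. The sketch below assumes sequences are modelled by Python lists; map_seq stands in for the Z function map (renamed to avoid clashing with Python's built-in map), and theorem_b is an illustrative helper that tests the generalized statement on given arguments rather than proving it.

```python
from typing import Callable, List


def map_seq(f: Callable[[int], int], s: List[int]) -> List[int]:
    """map f <> = <> and map f (<i> ^ s) = <f i> ^ (map f s)."""
    return [] if not s else [f(s[0])] + map_seq(f, s[1:])


def square(i: int) -> int:
    return i * i


def add_one(i: int) -> int:
    return i + 1


def square_seq(s: List[int]) -> List[int]:
    """square_seq == map square"""
    return map_seq(square, s)


def add_one_seq(s: List[int]) -> List[int]:
    """add_one_seq == map add_one"""
    return map_seq(add_one, s)


def theorem_b(f: Callable[[int], int], s: List[int], t: List[int]) -> bool:
    """Generalized theorem B: map f (s ^ t) = (map f s) ^ (map f t)."""
    return map_seq(f, s + t) == map_seq(f, s) + map_seq(f, t)
```

Because theorem_b takes the element operation as a parameter, one check covers squaring, adding one, and any other function on the elements.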

The above example showed the use of a higher order function to generalize a theorem
concerning sequences. Sequences are very similar to lists, and lists can be specified in Z
using the free type mechanism. This suggests that the technique of using higher order
functions can also be applied to theorems involving free types. This is illustrated in the
next example.

Example  5 

Consider the free type

    LIST ::= nil | join ⟪ℤ × LIST⟫

which consists of lists of integers. Generalizing theorems involving LIST is similar to the
problem of generalizing theorems involving sequences in example 4. For example, if ^
is concatenation of lists and square_list is a function which squares every element of
a list, then theorems such as

⊢ ∀ l, m : LIST •
    square_list (l ^ m) = (square_list l) ^ (square_list m)

can be generalized in a similar way to example 4. The generalization would involve a
higher order function, similar to map, but for lists rather than sequences. A


2.2.1  Strengths  and  Weaknesses 

It has been shown that theorems involving higher order functions are very useful. They
can be stored in a library and repeatedly used to obtain other theorems. This cuts down
the proof effort. But where do these higher order functions come from? It is significantly
more difficult to find a higher order function than to replace a subterm with a variable.
But some help can be offered. For example, if the theorem in example 4 is proved
without using higher order functions, then it can be seen that no property of
multiplication is used (the reader might like to try this). This means that the operation on
the individual elements of the sequences is irrelevant. Therefore, the higher order
function map needs only to consider an arbitrary operation on the elements. As far as
theorems involving free types are concerned, a single higher order function can be
automatically generated from the primitive recursion theorem (PRT) for the free type
[Smith a]. This function can be used to express any primitive recursive function over the
free type. The work described here is an extension to [Smith a]. The next example shows
how the PRT for the free type LIST in example 5 can be used to obtain this single higher
order function. It is then used to define two functions over LIST.

Example  6 

The PRT for the free type LIST in example 5 is (from [Smith a])

⊢ ∀ x : X; f : (X × ℤ × LIST) → X •
    ∃₁ h : LIST → X •
        h nil = x
        ∀ i : ℤ; l : LIST • h(join(i, l)) = f(h l, i, l)

The theorem says that a primitive recursive function h, over the free type LIST, is
uniquely defined by its base case (x) and its recursive case (f). The theorem is generic in
X, which is the target type of h. For example, if h is the function which finds the size of a
list then X will be ℕ. The fact that each x and f gives rise to a unique h means there is a
function (rather than a relation), H say, linking x and f to h. That is

    H(x, f) = h

The function H is the single higher order function discussed above. The formal definition
of H is derived from the PRT, by replacing h with H(x, f). The definition is as follows

    [X]
    H : (X × ((X × ℤ × LIST) → X)) → (LIST → X)

    ∀ x : X; f : (X × ℤ × LIST) → X •
        H(x, f) nil = x
        ∀ i : ℤ; l : LIST • H(x, f) (join(i, l)) = f(H(x, f) l, i, l)

Using H, the functions size (the number of elements in a list), and the concatenation of
two lists, can be written

    f1 == λ n : ℕ; i : ℤ; l : LIST • n + 1
    f2 == λ l : LIST; i : ℤ; m : LIST • join(i, l)

    size == H(0, f1)
    _ ^ _ == λ l, m : LIST • H(m, f2) l

A
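The construction in Example 6 can be sketched in executable form. The encoding below is an assumption made for illustration: nil is modelled by None and join(i, l) by the pair (i, l); the name concat stands in for the infix list concatenation operator. The function H follows the two defining equations directly.

```python
from typing import Any, Callable, Optional, Tuple

# Assumed encoding of the free type LIST: nil is None, join(i, l) is (i, l).
LIST = Optional[Tuple[int, Any]]
nil: LIST = None


def join(i: int, l: LIST) -> LIST:
    return (i, l)


def H(x: Any, f: Callable[[Any, int, LIST], Any]) -> Callable[[LIST], Any]:
    """H(x, f) nil = x and
    H(x, f)(join(i, l)) = f(H(x, f) l, i, l)."""
    def h(l: LIST) -> Any:
        if l is None:
            return x
        i, rest = l
        return f(h(rest), i, rest)
    return h


def f1(n: int, i: int, l: LIST) -> int:
    """Recursive case for size: one more than the size of the tail."""
    return n + 1


def f2(l: LIST, i: int, m: LIST) -> LIST:
    """Recursive case for concatenation: put the head back on the result."""
    return join(i, l)


size = H(0, f1)


def concat(l: LIST, m: LIST) -> LIST:
    """l ^ m == H(m, f2) l"""
    return H(m, f2)(l)
```

For instance, with l = join(1, join(2, nil)) and m = join(3, nil), size(l) evaluates to 2 and concat(l, m) to join(1, join(2, join(3, nil))).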

For free types, given the PRT, the single higher order function described above can be
automatically generated, and is therefore also of interest to a tool builder. In general,
although higher order functions make theorem proving easier, they make the proof harder
to understand. Quite often a more obscure specification can lead to an easier proof. This
conflict between specification and proof is discussed in section 7.


3  Induction 

Proofs  by  induction  occur  frequently:  the  proof  of  any  property  defined  over  an  infinite 
set  (for  example  the  natural  numbers)  will  almost  inevitably  involve  induction.  Proof  by 
induction  is  the  basis  of  the  Oyster-Clam  system  [Bundy  a],  which  carries  out  program 
synthesis.  Program  synthesis  is  where  a  program  is  extracted  from  the  proof  of  a  theorem. 
The  theorem  to  be  proved  is 

⊢ ∀ inputs • ∃ output • spec(inputs, output)    (A)

where spec(inputs, output) is a specification of the program. Theorem A says that
the program is required to produce an output satisfying the specification for every input.
What is required for the proof is the construction of an existential witness for theorem A
which will be the program prog(inputs) satisfying the theorem

⊢ ∀ inputs • spec(inputs, prog(inputs))

Given that theorem A starts with a universal quantifier, a proof by induction will usually
be appropriate. Each step of the proof corresponds to the introduction of a program
construct. For example, an inductive proof step corresponds to the introduction of a
recursive procedure. Proving A can involve complex induction schemata, corresponding
to the program structure.

For  any  proof  by  induction,  a  choice  has  to  be  made  of  the  induction  variable  and  the 
induction schema; that is, what will constitute the base and step cases. For example,
proving properties of the even numbers will usually involve stepping the induction
variable  by  two  rather  than  one.  The  induction  schema  together  with  the  induction 
variable  will  be  called  the  induction  strategy.  This  section  presents  a  technique  for 
choosing  the  induction  strategy.  When  the  induction  strategy  has  been  chosen,  the  proof 
must  be  completed,  so  this  section  also  presents  a  technique  for  completing  the  proof. 
The  techniques  presented  here  are  of  interest  to  a  person  carrying  out  a  pen  and  paper 
proof,  because  they  help  to  find  the  right  induction  strategy,  and  to  complete  the  proof. 
They  are  of  interest  to  a  tool  builder  as  both  techniques  can  be  automated. 


3.1  Choosing  the  Induction  Strategy 

The  technique  presented  here  for  choosing  the  induction  strategy  comes  from  the 
Boyer-Moore  theorem  prover  [Boyer].  Firstly,  the  technique  is  explained  when  there  is 
only  one  recursive  function  and  one  possible  induction  variable.  Theorems  involving 
more  than  one  recursive  function  and  more  than  one  possible  induction  variable  are 
discussed  later. 

For theorems involving only one recursive function and one possible induction variable,
that variable must obviously be chosen. The induction schema chosen should be the
one which mirrors the form of recursion used to define the function. For example, the
induction schema will have the same number of base and step cases as the form of
recursion. The next two examples illustrate the technique.

Example  7 

Consider the following recursive function

    _ + _ : (ℕ × ℕ) → ℕ

    ∀ m, n : ℕ •
        0 + m = m              (1)
        s(n) + m = s(n + m)    (2)


The  theorem 


can  be  proved  using  the  one  step  induction  schema 

⊢ P(0) ∧    (S1)
  (∀ n : ℕ • P(n) ⇒ P(s n))
  ⇒
  ∀ n : ℕ • P(n)

The base case in S1, namely P(0), corresponds to the base case of + (axiom 1). Similarly
the step case in S1, namely

∀ n : ℕ • P(n) ⇒ P(s n)

corresponds to the form of recursion in the step case of + (axiom 2). A

Example 8

Consider the following recursive definition of the set of even natural numbers.

    even : ℙ ℕ

    ∀ n : ℕ •
        0 ∈ even                    (1)
        s(0) ∉ even                 (2)
        s(s n) ∈ even ⇔ n ∈ even    (3)

The  theorem 


►  Vn  ;R<*ne  even  O  5  (n)  e  even 
can  be  proved  using  the  two  step  induction  schema 

⊢ P(0) ∧    (S2)
  P(s 0) ∧
  (∀ n : ℕ • P(n) ⇒ P(s(s n)))
  ⇒
  ∀ n : ℕ • P(n)

The two base cases in S2, namely P(0) and P(s 0), correspond to the two base cases in
the definition of even (axioms 1 and 2). The step case in S2, namely

∀ n : ℕ • P(n) ⇒ P(s(s n))

corresponds to the form of recursion in axiom 3 of even. A
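The way a schema mirrors a form of recursion can be seen by writing the definition of even as a recursive predicate. This is a sketch under the assumption that the naturals are modelled by non-negative Python integers; the name is_even is illustrative. The function has two base cases and steps its argument by two, exactly the shape of the two step schema.

```python
def is_even(n: int) -> bool:
    """Membership of the set even, following its three axioms."""
    if n == 0:
        return True            # axiom 1: 0 is even (first base case)
    if n == 1:
        return False           # axiom 2: s(0) is not even (second base case)
    return is_even(n - 2)      # axiom 3: s(s n) is even exactly when n is
```

An induction that stepped by one would meet is_even(n + 1) in its conclusion, which none of the three cases can unfold in terms of is_even(n); stepping by two matches the recursive call.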

It is worth saying at this point that induction schemata S1 and S2 are logically
equivalent. It is just that one induction schema is more appropriate for a theorem than
another. Now suppose that a theorem involved more than one recursive function. Also
suppose the theorem contained more than one possible induction variable. (In examples 7
and 8 there was only one possible induction variable, namely n.) Which induction
schema, and which induction variable should be chosen? The technique for choosing the
induction strategy is as follows. The explanation is based on that given in [Bundy b].
During the explanation, the phrase recursion term is used. This is a term such as s(n) in
example 7 and s(s n) in example 8. The recursion term appears in both the definition of
a recursive function, and in the corresponding induction schema.

Firstly, the form of the theorem is analysed to produce a number of raw induction
suggestions. These suggestions are then combined to produce a single induction strategy
which will be used for the proof. The raw induction suggestions are generated as follows.
The theorem is scanned to locate its recursive functions. Each occurrence of a recursive
function f, with a variable x in its recursive argument position, produces a raw induction
suggestion (the recursive argument position for the function + in example 7 is its left
hand argument). The raw induction suggestion consists of the induction schema that
mirrors the form of recursion used to define f, and induction variable x.

These suggestions are then combined as follows. It may be possible for one suggestion to
subsume another. This will be the case if the two suggestions consist of the same
induction variable, and the recursion term of one induction schema consists of repeated
nestings of the recursion term of the other schema. For example, the recursion term s(s
n) in example 8 subsumes the recursion term s(n) in example 7. If it is not possible for
one suggestion to subsume another, then it may be possible to produce a new suggestion
that subsumes both. This is achieved by merging the two suggestions. For example, a two
step schema and a three step schema can be merged to give a six step schema. After
subsumption and merging have been carried out there will be one suggestion for each
induction variable. It then remains to choose one of these suggestions as the induction
strategy for the theorem. This final step is achieved by considering the terms that would
occur in the induction conclusion, for each suggestion. Some suggestions would result in
terms that could not be rewritten using the definitions of the recursive functions. Such
suggestions are flawed and are removed. If more than one suggestion remains, the winner
is the one that subsumes the largest number of raw suggestions. The next example
illustrates the technique.

Example  9 

Consider the theorem

⊢ ∀ n, m : ℕ • n ∈ even ∧ m ∈ even ⇒ (n + m) ∈ even

where + and even are defined in examples 7 and 8. This theorem contains two recursive
definitions, even and +, and two possible induction variables, n and m. The raw induction
suggestions are

    (S2, n)    (A)
    (S2, m)    (B)
    (S1, n)    (C)

where S1 and S2 are the one step and two step induction schemata appearing in examples
7 and 8. Suggestion A is generated by the first occurrence of even in the theorem, with
the variable n in its recursive argument position. Suggestion B is generated by the second
occurrence of even, with the variable m in its recursive argument position. Suggestion C
is generated by the occurrence of +, with the variable n in its recursive argument position.

These three suggestions are combined to form the induction strategy as follows.
Suggestion A subsumes suggestion C, because they both contain the induction variable n,
the recursion term in S2 is s(s n), and the recursion term in S1 is s(n). This leaves
suggestions A and B. Suggestion B is flawed because it would produce the term n + s(s
m) in the induction conclusion, and this cannot be rewritten using the definition of +. The
induction strategy is therefore A: two step induction on n. A
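The selection procedure of Example 9 can be sketched as a small program. Everything in this sketch is a simplifying assumption made for illustration, not the algorithm as implemented in any prover: a schema is reduced to its number of steps, subsumption and merging are both modelled by taking the least common multiple of step counts, and a suggestion is treated as flawed when its variable also occurs directly in a non-recursive argument position of some recursive function. The names FUNCTIONS, raw_suggestions, combine, flawed and choose are hypothetical.

```python
from math import gcd
from typing import Dict, List, Optional, Tuple

# An occurrence of a recursive function in the theorem: (name, arguments).
# An argument is a variable name, or None when it is a compound term.
Occurrence = Tuple[str, Tuple[Optional[str], ...]]

# Assumed table: function name -> (recursive argument position, schema steps).
FUNCTIONS: Dict[str, Tuple[int, int]] = {"even": (0, 2), "+": (0, 1)}


def raw_suggestions(occs: List[Occurrence]) -> List[Tuple[int, str]]:
    """One (steps, variable) pair per occurrence whose recursive
    argument position holds a variable."""
    out = []
    for name, args in occs:
        pos, steps = FUNCTIONS[name]
        if args[pos] is not None:
            out.append((steps, args[pos]))
    return out


def combine(raw: List[Tuple[int, str]]) -> Dict[str, Tuple[int, int]]:
    """Per variable: merged step count (lcm) and number of raw
    suggestions it subsumes."""
    combined: Dict[str, Tuple[int, int]] = {}
    for steps, var in raw:
        old, count = combined.get(var, (1, 0))
        combined[var] = (old * steps // gcd(old, steps), count + 1)
    return combined


def flawed(var: str, occs: List[Occurrence]) -> bool:
    """Simplified flaw test: var sits in a non-recursive position."""
    return any(a == var and i != FUNCTIONS[name][0]
               for name, args in occs for i, a in enumerate(args))


def choose(occs: List[Occurrence]) -> Tuple[int, str]:
    """Return (steps, variable) for the surviving suggestion that
    subsumes the most raw suggestions."""
    sound = {v: sc for v, sc in combine(raw_suggestions(occs)).items()
             if not flawed(v, occs)}
    if not sound:
        raise ValueError("every suggestion is flawed")
    best = max(sound, key=lambda v: sound[v][1])
    return sound[best][0], best


# Occurrences in the theorem of Example 9: n in even, m in even,
# n + m, and (n + m) in even, whose argument is a compound term.
EXAMPLE_9: List[Occurrence] = [
    ("even", ("n",)), ("even", ("m",)), ("+", ("n", "m")), ("even", (None,)),
]
```

Under these assumptions choose(EXAMPLE_9) selects two step induction on n, in agreement with the example: the suggestion on m is discarded because m appears as the right hand argument of +.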


3.1.1  Strengths  and  Weaknesses 

Using the above technique, an induction strategy can be automatically generated. The
technique has been automated in the Boyer-Moore theorem prover [Boyer], and in the
Oyster-Clam system [Bundy b]. But the generated induction strategy might not always be
appropriate to prove the theorem. Once again, the technique is still safe; it just means that
the user will not be able to prove the theorem using that induction strategy. The next
example illustrates this.

Example  10 

Consider the theorem

⊢ ∀ m, n : ℕ₁ • m² ≠ 2*n²

Suppose the two functions that appear in this theorem (multiplication and exponentiation)
are defined recursively (which is usually the case). The above technique would produce
an induction schema based on these two recursive functions. But the appropriate
induction schema for this theorem is

⊢ (∀ n : ℕ • (∀ n' : ℕ | n' < n • P(n')) ⇒ P(n))
  ⇒
  ∀ n : ℕ • P(n)

This is a case of Noetherian induction. This induction schema says that P(n) must be
proved under the assumption that P holds for all values less than n. That is, the induction
conclusion P(n) must be proved from the induction hypothesis

∀ n' : ℕ | n' < n • P(n')

Basically, this means that for the above theorem, m² ≠ 2*n² must be proved under the
assumption that m² ≠ 2*n² is true for all smaller values of m and n. This is achieved by
assuming m² = 2*n² (the negation of the induction conclusion) and obtaining a
contradiction. The contradiction arises because, in assuming m² = 2*n², it can be proved
that m'² = 2*n'², where m' and n' are smaller than m and n. This contradicts the induction
hypothesis. A

3.2  Completing  the  Proof 

Once the induction strategy has been chosen, the proof must be completed. In [Bundy a]
a technique is described for manipulating the induction conclusion so that the induction
hypothesis can be used. This technique is called rippling-out. The difference between an
induction hypothesis and conclusion is in the recursion term (recursion terms were
described in section 3.1). For example, the induction strategy finally chosen in example
9 used the recursion term s(s(n)). The induction conclusion contains s(s(n))
wherever the induction hypothesis contains just n. The expression s(s(...)) (the
recursion term without the n) is an example of a wave front. To complete the proof, the
wave front must be moved outwards from its deeply nested positions (or rippled, like a
wave on a pond), to reveal a copy of the induction hypothesis. The next example
illustrates rippling-out.

Example  11 

Consider  the  function  +  as  described  in  example  7,  namely 


(1) 

(2) 

Proving  the  associativity  of  +,  namely 

⊢ ∀ x, y, z : ℕ • (x + y) + z = x + (y + z)

by induction on x gives an induction hypothesis

(x + y) + z = x + (y + z)

and an induction conclusion

(s(x) + y) + z = s(x) + (y + z)

Using repeated applications of axiom 2, the wave front s(...) can be rippled outwards
to give

s((x + y) + z) = s(x + (y + z))    (A)

Since s is an injective function, it obeys the law

(s(m) = s(n)) ⇔ (m = n)

for any natural numbers m and n. Using this law, expression A above can be rewritten as

(x + y) + z = x + (y + z)

The wave front s(...) has now completely rippled-out, revealing a copy of the
induction hypothesis. The proof is therefore complete. A
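
The rippling step can be checked mechanically on small cases. The following Python sketch is an illustration only, not part of the original report. It assumes that + is defined by the axioms 0 + y = y and s(x) + y = s(x + y), encodes natural numbers as nested tuples, and uses the hypothetical names Z, s, plus, from_int and ripple_step_holds.

```python
# Peano naturals as nested tuples: Z is zero, s(n) is the successor of n.
Z = ()

def s(n):
    return (n,)

def plus(x, y):
    """Addition by primitive recursion on x (assumed axioms 1 and 2)."""
    if x == Z:
        return y                    # axiom 1: 0 + y = y
    return s(plus(x[0], y))         # axiom 2: s(x) + y = s(x + y)

def from_int(k):
    """Build the Peano numeral for a small non-negative Python int."""
    n = Z
    for _ in range(k):
        n = s(n)
    return n

def ripple_step_holds(x, y, z):
    """Both sides of the induction conclusion equal s applied to the
    corresponding side of the induction hypothesis (the wave front has
    moved to the outside)."""
    lhs = plus(plus(s(x), y), z)    # (s(x) + y) + z
    rhs = plus(s(x), plus(y, z))    # s(x) + (y + z)
    return lhs == s(plus(plus(x, y), z)) and rhs == s(plus(x, plus(y, z)))

nums = [from_int(k) for k in range(5)]
ripple_ok = all(ripple_step_holds(x, y, z)
                for x in nums for y in nums for z in nums)
assoc_ok = all(plus(plus(x, y), z) == plus(x, plus(y, z))
               for x in nums for y in nums for z in nums)
```

A finite check of this kind is not a proof, but it is a cheap way to catch a mis-stated axiom before attempting the induction.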


3.2.1  Strengths and Weaknesses

The technique of rippling-out is easy to apply. The user simply has to apply axioms so
that wave fronts in the induction conclusion move outwards. The technique has been
automated in the Oyster-Clam system [Bundy a]. The technique is used by Oyster-Clam
as a central strategy for proof. In this system, not only does the technique complete an
induction proof, it also plays a part in choosing the induction strategy. This is because the
technique for choosing the induction strategy (section 3.1) has been automated in
Oyster-Clam so as to look ahead to ensure subsequent rippling can proceed. But the
technique  of  rippling-out  is  not  always  appropriate.  For  example,  sometimes  a  wave  front 
needs  to  move  sideways  instead  of  outwards.  That  is,  the  wavefront  remains  at  the  same 
level  of  nesting.  The  next  example  illustrates  this. 

Example  12 

Consider the function qrev in example 2. One of the axioms of qrev is

qrev(⟨x⟩ ⌢ s, t) = qrev(s, ⟨x⟩ ⌢ t)

Using this axiom in a proof will move the wavefront, ⟨x⟩ ⌢, sideways. The wavefront is at
the same level of nesting on both sides of the above axiom. A
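
The sideways movement can be seen by running the axiom as a program. The sketch below is illustrative only: it renders sequences as Python lists, and the base case qrev(⟨⟩, t) = t is an assumption, since only the recursive axiom is quoted here.

```python
def qrev(s, t):
    """Accumulator-style reverse on Python lists (hypothetical rendering)."""
    if not s:
        return t                   # assumed base case: qrev(<>, t) = t
    x, rest = s[0], s[1:]
    # The wave front <x> ^ ... leaves the first argument and reappears,
    # at the same depth of nesting, in the second argument.
    return qrev(rest, [x] + t)

reversed_123 = qrev([1, 2, 3], [])
```

Each recursive call is one use of the axiom, so the head element moves across from the first argument to the second without being pushed outwards past qrev.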

Extensions  to  rippling-out  have  been  made  in  [Bundy  c].  They  include  rippling  sideways 
as  described  above,  but  are  beyond  the  scope  of  this  report.  The  extensions  have  been 
automated  in  Oyster-Clam. 


4  Window  Inference 

Window  inference  is  a  style  of  reasoning  which  enables  an  expression  to  be  transformed 
by  restricting  attention  to  a  subexpression.  This  is  called  opening  a  window  on  the 
subexpression.  Window  inferencing  transforms  an  expression  without  affecting  the  rest 
of  the  expression,  but  allows  contextual  information  to  be  used  while  transforming  the 
subexpression.  The  contextual  information  is  derived  from  the  original  expression  minus 
the  subexpression.  This  section  formalizes  window  inference  so  that  it  can  be  automated 
and  is  therefore  of  interest  to  a  tool  builder.  The  formalization  presented  here  is  from 
[Grundy]. 


4.1  Formalizing  Window  Inference 

It is useful to see an example of window inference before attempting to formalize it. The
next  example  illustrates  the  use  of  window  inference  to  simplify  an  expression. 

Example  13 

Suppose  the  expression 

⟨head s⟩ ⌢ (tail s) = t ∧ s ≠ ⟨⟩    (A)

where s and t are sequences, is to be simplified. Consider the following law from
[Spivey]

⊢ s ≠ ⟨⟩ ⇒ ⟨head s⟩ ⌢ (tail s) = s    (B)

How can this law be used to simplify expression A? Window inference allows a window
to be opened (shown as a box below) on a subexpression of A


| ⟨head s⟩ ⌢ (tail s) = t |  ∧  s ≠ ⟨⟩


The contextual information, s ≠ ⟨⟩ (the remainder of A), can then also be used to simplify
the window expression. This contextual information, together with theorem B, enables the
following fact to be deduced (by modus ponens)

⟨head s⟩ ⌢ (tail s) = s

Rewriting with this new fact allows the window expression to be simplified, so that the
original expression becomes

s = t ∧ s ≠ ⟨⟩

A
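
The steps of this example can be sketched as a small program. The Python below is a minimal illustration under stated assumptions, not the formalization of [Grundy]: expressions are nested tuples, a law is a triple (condition, lhs, rhs) read as "condition ⇒ lhs = rhs", and the names rewrite and window_on_left_conjunct are hypothetical.

```python
def rewrite(expr, lhs, rhs):
    """Replace every occurrence of the subterm lhs in expr by rhs."""
    if expr == lhs:
        return rhs
    if isinstance(expr, tuple):
        return tuple(rewrite(e, lhs, rhs) for e in expr)
    return expr

def window_on_left_conjunct(expr, laws):
    """Open a window on P in ('and', P, Q), with Q as contextual information.

    A law fires only when its condition is exactly the context Q, which
    plays the role of the modus ponens step in the text."""
    op, p, q = expr
    if op != "and":
        raise ValueError("expected a conjunction")
    for condition, lhs, rhs in laws:
        if condition == q:
            p = rewrite(p, lhs, rhs)
    return (op, p, q)              # closing the window leaves Q untouched

# Expression A:  <head s> ^ (tail s) = t  /\  s /= <>
s_nonempty = ("neq", "s", "<>")
cons = ("cat", ("seq", ("head", "s")), ("tail", "s"))
expr_a = ("and", ("eq", cons, "t"), s_nonempty)

# Law B:  s /= <>  =>  <head s> ^ (tail s) = s
law_b = (s_nonempty, cons, "s")

simplified = window_on_left_conjunct(expr_a, [law_b])
```

Only the window expression changes; the context is consulted but returned unchanged, which is the property the window rules below make precise.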

Window  inference  can  be  formalized  by  using  window  rules.  A  window  rule  is  a 
particular type of inference rule. For example, the window inference carried out in the
last example can be formalized with the window rule

Q, Γ ⊢ P ⇔ P'
──────────────────────────
Γ ⊢ (P ∧ Q) ⇔ (P' ∧ Q)

In this rule the complete expression is P ∧ Q, the window expression is P, and the
contextual information is Q. The rule states that if the window expression can be
simplified to P' (using the contextual information) then the complete expression can be
simplified to P' ∧ Q. The symbol Γ denotes a list (possibly empty) of other facts that can
be used when simplifying. In the last example Γ consisted of theorem B. The general
form of a window rule is

γ, Γ ⊢ e r e'
──────────────────
Γ ⊢ E[e] R E[e']

The complete expression is E[e], the window expression is e, and the contextual
information is γ. The transformed window expression is e', which means that the
complete expression is transformed to E[e']. The symbol Γ denotes a list (possibly
empty) of additional facts that may be used during the transformation. The relationship
between the original and transformed window expression is r. Similarly, the complete
expressions before and after transformation are related by R. In general r and R are
different, as illustrated in the next example.

Example  14 

Consider  the  formal  refinement  of  a  specification  to  code  (see  for  example  [Morgan]). 
When  strengthening  the  postcondition  it  is  natural  to  do  so  under  the  assumption  that  the 
precondition  holds.  This  can  be  formalized  using  the  window  rule 

pre, Γ ⊢ post ⇐ post'
────────────────────────────────────────
Γ ⊢ w : [pre, post] ⊑ w : [pre, post']

The symbol ⊑ is the refinement relation and w is the list of variables whose values may
change. In this example r is "reverse" implication and R is refinement. Therefore when
carrying out a formal refinement of the statement

w : [pre, post]

a window may be opened on the subexpression post as shown


w : [pre, | post | ]


while assuming the contextual information pre. A


4.2  Opening  a  Window  Within  a  Window 

A window can be opened within a window and so on, creating a window stack. To
formalize this consider the pair of relations (r, R) appearing in a general window rule.
A window may be opened inside another provided that r for the outer window is the
same as R for the inner. The next example illustrates this.

Example  15 

Consider  a  backwards  (subgoaling)  style  of  proof.  A  goal  of  the  form 

x ∈ { D | P • u }

can often be solved by strengthening the predicate P to P'. This can be achieved in a
series of window inference steps as follows. First of all a window is opened on the set
using the window rule

Γ ⊢ S ⊇ S'
─────────────────────
Γ ⊢ x ∈ S ⇐ x ∈ S'

This rule states that the set must be transformed into a new set with fewer elements. This is
achieved by opening a window on the subexpression P using the window rule

D, Γ ⊢ P ⇐ P'
──────────────────────────────────
Γ ⊢ { D | P • u } ⊇ { D | P' • u }

and strengthening the predicate P (to give a new predicate P'). Once this has been
achieved (by opening another window perhaps, or using a theorem), both windows can be
closed to give the new subgoal

x ∈ { D | P' • u }

A


4.3  Strengths and Weaknesses


Window  inference  is  a  powerful  technique  as  it  allows  the  user  to  concentrate  on  a 
subexpression,  and  assume  contextual  information  while  doing  so.  There  are  many  areas 
where window inference is useful, for example the simplification of expressions,
refinement, and goal directed theorem proving. A disadvantage is that the window rules
which formalize the technique are complicated. If window inference is automated, then
although window rules are complicated they will be hidden from the user. The tool will
allow a window to be opened up provided that it has a window rule to justify it. Window
inference  has  been  automated  in  the  HOL  theorem  prover  [Grundy]. 


5  Consistency  of  Z  Specifications 

It  is  important  that  a  Z  specification  is  consistent.  An  inconsistent  specification  may  lead 
to  false  conclusions  in  reasoning  thus  destroying  the  point  of  having  a  specification.  For 
example, a specification of a natural number x using the axiom x = x + 1 is
syntactically correct and well typed, but no such x exists. In general, for each object
specified, a theorem should be proved stating that the object exists.

One way in which inconsistencies can arise is when using a Z free type. [Smith a] shows
that a user defined recursive free type may not exist, or even if it does, a recursive
function defined over it may not. [Smith a] presents techniques for checking the
consistency  of  recursive  free  types,  and  recursive  functions  defined  over  them.  This 
section  extends  the  technique  so  that  a  wider  class  of  functions  can  be  checked.  The 
extension  is  of  interest  to  a  person  carrying  out  a  pen  and  paper  proof  so  as  to  avoid  false 
conclusions  when  reasoning.  It  is  of  interest  to  a  tool  builder  as  the  extension  can  be 
automated. 


5.1  Functions  Defined  Over  Free  Types 

[Smith  a]  presents  a  technique  to  prove  the  existence  of  a  recursive  function  defined  over 
a  recursive  free  type.  But  the  technique  can  only  be  used  for  a  function  defined  by 
primitive  recursion.  The  technique  is  now  extended  to  cover  some  non-primitive 
recursive  functions.  Basically  the  idea  is  to  rewrite  the  non-primitive  definition  in  terms 
of  a  primitive  one,  after  which  the  technique  in  [Smith  a]  may  be  used.  The  reader  needs 
only  to  appreciate  how  a  non-primitive  function  can  be  rewritten  in  terms  of  a  primitive 
one.  The  extension  only  covers  non-primitive  functions  defined  over  a  free  type  of  the 
form 


T ::= a1 | ... | am | c1 ⟨⟨T⟩⟩ | ... | cn ⟨⟨T⟩⟩

(the basic technique in [Smith a] covers all free types). A function f defined by primitive
recursion over the above free type T has a recursive case of the form

f(ci t) = A(f t)

where A(f t) is some expression involving f t. A non-primitive function g might have
a step case of the form

g(c1(c2 t)) = B( g(c1 t), g(c2 t), g t )    (1)

In general, if there are n constructors ci on the left hand side of 1 then the right hand side
will contain applications of g containing zero, one, two, ..., n-1 constructors. Such a
function g can be written in terms of a primitive recursive function g1. The idea is that g1
t delivers the tuple

( g t,
  g(c1 t), ..., g(cn t),
  applications of g with two constructors,
  ...
  applications of g with n-1 constructors )

(this tuple will contain 1 + n + n² + ... + nⁿ⁻¹ elements). It is then the case that
g1(ci t) can be written in terms of g1 t (note this is primitive recursion). Once g1 has
been constructed the original function g can be written g t = fst(g1 t), where fst
projects the first element from the tuple. The next two examples illustrate the technique.

Example  16 

The  natural  numbers  can  be  considered  as  the  free  type 

ℕ ::= 0 | s ⟨⟨ℕ⟩⟩


Consider the function

    fib : ℕ → ℕ
    ─────────────────────
    ∀ n : ℕ •
      fib 0 = 0                       (1)
      fib 1 = 1                       (2)
      fib(n+2) = fib(n+1) + fib(n)    (3)

which generates the Fibonacci numbers. The function fib is defined by non-primitive
recursion. A primitive recursive function fib1 is now constructed. The idea behind the
construction is that

fib1 n = (fib n, fib(n+1))    (A)

Using this idea

fib1 0
= (fib 0, fib 1)    (using A)
= (0, 1)            (using axioms 1 and 2 of fib)

and

fib1(n+1)
= (fib(n+1), fib(n+2))                         (using A)
= (fib(n+1), fib(n+1) + fib(n))                (using axiom 3 of fib)
= (λ x, y : ℕ • (y, y+x)) (fib n, fib(n+1))    (λ-abstraction)
= (λ x, y : ℕ • (y, y+x)) (fib1 n)             (using A)

The formal definition of fib1 is therefore

h == λ x, y : ℕ • (y, y + x)

    fib1 : ℕ → (ℕ × ℕ)
    ─────────────────────
    ∀ n : ℕ •
      fib1 0 = (0, 1)
      fib1(n+1) = h(fib1 n)

Notice that this is a primitive recursive function since fib1(n+1) is defined in terms of
fib1(n). The original function fib can then be written

fib n = fst(fib1 n)    (B)

The above approach of using the function fib1 is also used in producing the fast
Fibonacci function in functional programming, described in [Bird]. The definition B,
considered as a functional program, is more efficient than the original definition, taking
fewer rewrites to evaluate the nth Fibonacci number. A
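
The construction can be transcribed directly into a program. The following Python sketch is an illustration, not part of the original report; fast_fib is a hypothetical name for definition B, and tuples stand in for Z pairs.

```python
def fib(n):
    """Direct transcription of axioms 1 to 3 (non-primitive recursion)."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)

def h(pair):
    """h == lambda x, y . (y, y + x)"""
    x, y = pair
    return (y, y + x)

def fib1(n):
    """Primitive recursion: fib1(n+1) depends only on fib1(n)."""
    if n == 0:
        return (0, 1)
    return h(fib1(n - 1))

def fast_fib(n):
    """Definition B: fib n = fst(fib1 n)."""
    return fib1(n)[0]

agree = all(fast_fib(n) == fib(n) for n in range(15))
```

fib1 makes one recursive call per step, whereas fib makes two, which is the source of the efficiency gain mentioned above.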

Example  17 

Consider  the  free  type 

T ::= a | b ⟨⟨T⟩⟩ | c ⟨⟨T⟩⟩


and the non-primitive recursive function


5.2  Strengths  and  Weaknesses 


The technique described helps in checking the consistency of a Z specification. The
technique of rewriting a non-primitive recursive function in terms of a primitive one can
be automated. A disadvantage of the technique is that the definition of the non-primitive
recursive function must be rewritten in an obscure way. This obscure specification helps
to prove the existence of the function. This is another example of how a more obscure
specification  can  lead  to  an  easier  proof.  This  conflict  between  specification  and  proof  is 
discussed  in  section  7. 


6  Reasoning  with  Schemas 

In  Z,  it  is  useful  to  reason  at  the  schema  level,  without  getting  lost  in  a  mass  of  low  level 
predicates.  A  schema  can  appear  as  a  set,  a  predicate,  in  a  schema  expression  or  as  an 
inclusion.  In  its  last  role  it  provides  its  main  expressive  power,  but  also  the  greatest 
difficulty  for  reasoning  as  it  is  not  obvious  when  to  stop  expanding  schema  inclusions. 

This  topic  will  not  be  dealt  with  here  except  to  note  that  the  main  new  feature  introduced 
by  a  schema  is  its  signature,  so  proof  rules  need  to  be  concerned  with  the  effect  of 
binding  and  variable  introduction.  In  the  case  of  schemas  as  predicates  and  schema 
expressions  it  is  reasonable  to  assume  that  typechecking  has  discharged  the  scope  and 
type  consistency  obligations,  in  which  case  schemas  can  be  handled  as  predicates  and 
simple laws given for the schema operators. The propositional schema operators (∧, ∨
etc.) obey all the usual laws of propositional logic. If S, T and U are schemas then

S ∧ (T ∨ U) = (S ∧ T) ∨ (S ∧ U)

Similar  laws  can  be  given  for  the  other  schema  operators,  and  as  an  illustration  of  the 
laws  useful  for  reasoning  at  the  schema  level,  the  laws  appropriate  for  preconditions  will 
be  given. 


6.1  Laws  For  Calculating  Preconditions 

When  describing  an  operation  in  Z,  it  is  useful  to  know  when  the  specified  operation  can 
be used. In particular, for consistency, it is important to check that the domain of
applicability is not empty. When using a schema to describe an operation, the
applicability of the operation is described by its precondition. This section contains
useful laws for calculating the precondition. The laws presented here have been
rigorously proven by hand although they should be formally proved using the semantics
of  Z  [Brien].  The  work  extends  that  in  [Gilmore]  to  consider  all  the  schema  operators  in 
[Spivey]  and  [McMorran]. 

Some  of  these  rules  have  side  conditions  which  are  expressed  in  square  brackets  directly 
after the rule. The following notation is used to express these side conditions. If S is a
schema, let bS denote the set of "before" components of S (the undashed variables and
the inputs), and aS denote the set of "after" components of S (the dashed variables and
the outputs). Also let !S denote just the outputs. The next example illustrates this
notation.

Example 18

If S is the schema

    x, x', z! : ℤ
    y, y', i? : ℕ

then

bS = {x, y, i?}    aS = {x', y', z!}    !S = {z!}

A

Also,  the  rules  are  general  in  the  sense  that  there  is  no  necessity  for  a  schema  to  be  of  the 
form 


    ΔState


that is, where every undashed variable has a dashed counterpart. The schema could have
an undashed variable with no dashed counterpart, or vice versa. The rules are listed below
under the particular schema operation involved.


Disjunction

pre(S ∨ T) = pre(S) ∨ pre(T)

Conjunction

pre(S ∧ T) = pre(S) ∧ pre(T)    [aS ∩ aT = {}]

Implication

pre(S ⇒ T) = pre(¬S) ∨ pre(T)

Equivalence

pre(S ⇔ T) = pre(S ⇒ T) ∧ pre(T ⇒ S)    [aS ∩ aT = {}]

This is a very interesting law. At first sight, this law appears to follow easily from the law
for conjunction above, since

S ⇔ T = S ⇒ T ∧ T ⇒ S

But in order to use that law its side condition must be satisfied for the schemas S ⇒ T
and T ⇒ S. But these two schemas have the same signature (formed from merging the
signatures of S and T). Thus the side condition is not satisfied, and so the law cannot be
used. The above law for schema equivalence must be derived by other means. It can be
used with the law for implication to obtain an expression for pre(S ⇔ T) in terms of the
preconditions of S, T, ¬S and ¬T.

Negation


There is no useful law for pre(¬S). At first glance, one might have expected pre(¬S) to
be equal to ¬(pre S) but this is not the case. A counterexample follows.

Example 19

Let S be the schema


Both pre S and pre(¬S) are the schema

    x : ℤ
    ─────
    true

Thus ¬(pre S) is the schema

    x : ℤ
    ─────
    false

which is very different from pre(¬S). A

The reason why pre(¬S) is not equal to ¬(pre S) is as follows. A precondition
describes the applicability of an operation rather than describing the operation itself.
Thus it is possible for two schemas to have very different properties, but the same
precondition (as seen for the two schemas S and ¬S in the above example).
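
The behaviour of pre under negation, disjunction and conjunction can be explored on a small finite model. The Python sketch below is an illustration only. It treats a schema over one state variable x as a set of (x, x') pairs drawn from a four-element state space, and the particular schemas S and T are hypothetical choices made for the demonstration; they are not taken from the examples in this report.

```python
from itertools import product

STATES = range(4)                         # finite stand-in for the type of x
UNIVERSE = set(product(STATES, STATES))   # every (x, x') pair

def pre(schema):
    """Before-states for which some after-state satisfies the schema."""
    return {x for (x, _) in schema}

def neg(schema):
    """Schema negation: the complement within the same signature."""
    return UNIVERSE - schema

# Hypothetical S: x' = (x + 1) mod 4.  Hypothetical T: x' < x.
S = {(x, (x + 1) % 4) for x in STATES}
T = {(x, x2) for (x, x2) in UNIVERSE if x2 < x}

all_before = set(STATES)
pre_S = pre(S)                   # every state has a successor under S
pre_not_S = pre(neg(S))          # and also some after-state violating S
not_pre_S = all_before - pre_S   # the negated precondition is empty

disjunction_law = pre(S | T) == pre(S) | pre(T)
# S and T share the after component x', so the side condition
# aS n aT = {} fails and the conjunction law is not expected to hold.
conjunction_law = pre(S & T) == pre(S) & pre(T)
```

In this model pre S and pre(¬S) coincide while ¬(pre S) is empty, and the conjunction law fails precisely because its side condition is violated.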

Overriding 


pre(S ⊕ T) = pre(S) ∨ pre(T)

This law is similar to the law for disjunction because S ⊕ T is like S ∨ T but with the
priority given to T. This priority does not affect the precondition.

Composition

pre(S ; T) = pre(S ; pre(T))    [!S ∩ !T = {}]

Recall that for S ; T to be defined the set of dashed variables of S must equal the set of
undashed variables of T.

Piping

pre(S >> T) = pre(S >> pre(T))    [(aS - pipe) ∩ aT = {}]

where pipe is the set of outputs of S that are piped into T. This law is similar to that for
composition, but with ; replaced by >>. This is because the operation of piping is similar to
composition, but with inputs and outputs forming the interface between S and T, rather
than dashed and undashed variables.

Projection

pre(S ↾ T) = (pre(S) ∧ pre(T)) \ (bS - bT)    [aS ∩ aT = {}]

where bS - bT is the list of variables that appear in bS but not in bT.

Restriction

pre [S | P] = [pre(S) | P]    [P is a predicate over bS]

Quantification

pre(∃ D | P • S) = (pre [S | P]) \ (D - aS)

where (D - aS) is the list of variables that appear in the declaration D but not in aS.
There are no useful rules for pre(∃₁ D | P • S) and pre(∀ D | P • S).

Hiding

pre(S \ (v)) = (pre S) \ (v - aS)

where v is a list of variables that appear in S. The expression (v - aS) is the list of
variables that appear in v but not in aS.

The  next  example  illustrates  the  use  of  the  above  rules. 

Example  20 

Let Sub10 and Sub4 be schemas describing the operations of subtracting 10 and 4
respectively.


The laws can be used to calculate pre(Sub10 >> Sub4).
Using the law for piping, then

pre(Sub10 >> Sub4) = pre(Sub10 >> pre(Sub4))    (A)

Now pre(Sub4) is the schema


which  simplifies  to 


The schema Sub10 >> pre(Sub4) is therefore

    a? : ℕ
    ─────────────────────
    ∃ b : ℕ • b = a? - 10 ∧ b ≥ 4

which simplifies to

    a? : ℕ
    ─────────────────────
    a? ≥ 14                  (B)

Using A and the fact that schema B has no after components, pre(Sub10 >> Sub4) is
equal to schema B. The calculated precondition is as expected, since Sub10 >> Sub4
represents the operation of subtracting 14. A
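
The piping law can be checked for this example on a bounded model. The Python sketch below is an illustration under stated assumptions: natural numbers are restricted to 0..39, each operation is a set of (input, output) pairs whose result must itself be a natural number, and the helper names pipe and pipe_into_pre are hypothetical.

```python
N = range(0, 40)    # finite stand-in for the natural numbers

# Each operation is a set of (input, output) pairs; results must stay natural.
sub10 = {(a, a - 10) for a in N if a - 10 >= 0}
sub4 = {(b, b - 4) for b in N if b - 4 >= 0}

def pre(op):
    """Inputs for which the operation has some output."""
    return {i for (i, _) in op}

def pipe(op1, op2):
    """S >> T: the output of op1 is fed to op2 as its input."""
    return {(i, o2) for (i, o1) in op1 for (j, o2) in op2 if o1 == j}

def pipe_into_pre(op1, pre2):
    """S >> pre(T): inputs of op1 whose output satisfies pre(T).
    The result has no after components, so it is its own precondition."""
    return {i for (i, o1) in op1 if o1 in pre2}

lhs = pre(pipe(sub10, sub4))
rhs = pipe_into_pre(sub10, pre(sub4))
```

Both sides come out as the inputs from 14 upwards, and the piped operation maps each such input a to a - 14.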


6.2  Strengths  and  Weaknesses 

The  rules  allow  reasoning  at  the  schema  level.  Also  if  an  operation  has  been  specified  in 
terms  of  basic  operations,  using  many  schema  operators,  then  the  rules  can  be  used  to 
calculate  its  precondition  from  the  preconditions  of  the  basic  operations.  This  means  that 
the  preconditions  of  the  basic  operations  can  be  reused,  thus  avoiding  duplication  of 
effort.  The  rules  could  be  used  by  a  tool  builder  to  populate  a  library,  or  be  used  to 
produce  an  automatic  precondition  calculator. 


7  The  Conflict  Between  Specification  and  Proof 

It is sometimes easier to prove a theorem starting from one specification than from
another.  Although  it  can  be  easier  to  start  from  one  specification,  this  can  be  at  the 
expense  of  clarity.  If  the  specification  is  hard  to  understand,  then  it  is  unclear  exactly 
what  has  been  specified.  It  is  unclear  if  the  specification  captures  the  author’s  intentions. 
This  in  turn  makes  it  unclear  exactly  what  has  been  proved.  The  whole  point  in  carrying 
out  proof  is  to  increase  confidence  in  the  system  being  developed.  If  it  is  unclear  exactly 
what  has  been  proved  then  the  proof  is  pointless.  The  next  example  illustrates  this. 

Example  21 

Consider the highest common factor (hcf) of two non-zero natural numbers. Recall that
the hcf of two numbers is the largest number which divides both. For example the hcf of
6 and 9 is 3. Consider the following specification of hcf (which is basically Euclid's
algorithm for calculating the hcf).

    hcf : (ℕ₁ × ℕ₁) → ℕ₁
    ─────────────────────
    ∀ x, y : ℕ₁ •
      hcf(x, x) = x
      hcf(x + y, y) = hcf(x, y)
      hcf(x, x + y) = hcf(x, y)

With this specification theorems involving hcf are easy to prove. The reason is that the
specification lends itself to proof by induction. The particular induction schema required
is generated by section 3 (induction), and is

⊢ (∀ x : ℕ₁ • P(x,x)) ∧
  (∀ x, y : ℕ₁ • P(x,y) ⇒ P(x + y, y)) ∧
  (∀ x, y : ℕ₁ • P(x,y) ⇒ P(x, x + y))
  ⇒ (∀ x, y : ℕ₁ • P(x,y))

The problem is that the above specification and corresponding induction schema are hard
to understand. It is not obvious that the specification captures the meaning of hcf. Now
consider the following alternative specification of hcf.

    hcf : (ℕ₁ × ℕ₁) → ℕ₁
    ─────────────────────
    ∀ x, y, z : ℕ₁ •
      hcf(x,y) divides x
      hcf(x,y) divides y
      z divides x ∧ z divides y ⇒ z divides hcf(x,y)

where m divides n if and only if n/m is a natural number. Proving theorems is more
difficult using this specification, but the specification is easier to understand. For
example, the first two axioms state that the hcf of two numbers divides both numbers,
while the third axiom states that the hcf is the largest such number. It is much easier to
see that this second specification captures the meaning of hcf. A
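
The relationship between the two specifications can be tested on small values before a proof is attempted. The Python sketch below is an illustration, not a proof: it reads the three axioms hcf(x, x) = x, hcf(x + y, y) = hcf(x, y) and hcf(x, x + y) = hcf(x, y) as a subtraction-based program, and then checks the three properties of the second specification by brute force. The name satisfies_clear_spec is hypothetical.

```python
def hcf(x, y):
    """Executable reading of the Euclid-style axioms, for x, y >= 1."""
    while x != y:
        if x > y:
            x = x - y      # hcf(x + y, y) = hcf(x, y)
        else:
            y = y - x      # hcf(x, x + y) = hcf(x, y)
    return x               # hcf(x, x) = x

def divides(m, n):
    return n % m == 0

def satisfies_clear_spec(x, y):
    """hcf(x, y) divides x and y, and every common divisor divides it."""
    h = hcf(x, y)
    common = [z for z in range(1, min(x, y) + 1)
              if divides(z, x) and divides(z, y)]
    return (divides(h, x) and divides(h, y)
            and all(divides(z, h) for z in common))

specs_agree = all(satisfies_clear_spec(x, y)
                  for x in range(1, 31) for y in range(1, 31))
```

A check of this kind only covers a finite range, so it raises confidence in the equivalence rather than establishing it.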

As  mentioned  earlier,  if  a  specification  is  hard  to  understand  then  it  is  unclear  whether  it 
captures  the  author’s  intentions.  There  is  a  real  danger  that  it  specifies  something  very 
different.  It  is  possible  for  two  specifications  to  be  quite  similar,  but  specify  very 
different objects. Thus if a mistake has been made in an unclear specification it will be
hard  to  notice,  and  something  very  different  will  be  specified.  This  in  turn  means  that  a 
different  theorem  to  the  intended  theorem  is  being  proved.  The  next  example  illustrates 
this  point. 

Example  22 

Suppose  a  mistake  has  been  made  in  the  unclear  specification  of  hcf  in  example  21,  giving 


    hcf : (ℕ₁ × ℕ₁) → ℕ₁
    ─────────────────────
    ∀ x, y : ℕ₁ •
      hcf(x, x) = x
      hcf(x + y, y) = hcf(x, y)
      hcf(x, x + y) = x

(the right hand side of the third axiom is different). This specification does not specify
highest common factor at all, it specifies a modulo function. For any two numbers x and
y the above function repeatedly subtracts y from x until it is in the range 1..y. Such a
mistake is more difficult to find in an unclear specification. Thus theorems proved using
the above specification are not theorems about highest common factor at all, they are
theorems about a modulo function. A
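
Running the mistaken axioms as a program makes the error visible at once. The Python sketch below is an illustration; it reads the axioms hcf(x, x) = x, hcf(x + y, y) = hcf(x, y) and hcf(x, x + y) = x operationally, under the hypothetical name bad_hcf.

```python
def bad_hcf(x, y):
    """Executable reading of the mistaken specification, for x, y >= 1."""
    while x > y:
        x = x - y          # hcf(x + y, y) = hcf(x, y)
    return x               # hcf(x, x) = x  and  hcf(x, x + y) = x

# The result always lies in 1..y and equals ((x - 1) mod y) + 1.
is_modulo = all(bad_hcf(x, y) == (x - 1) % y + 1
                for x in range(1, 41) for y in range(1, 41))
example = bad_hcf(6, 9)    # the highest common factor of 6 and 9 is 3
```

For 6 and 9 the function returns 6 rather than 3, a discrepancy that is easy to see by execution but easy to miss when reading the axioms.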

Other  examples  of  this  conflict  between  specification  and  proof  have  appeared  in  earlier 
sections  of  this  paper.  In  section  3  (induction),  higher  order  functions  were  used  to  make 
theorem  proving  easier.  But  these  functions  made  the  specification  and  proof  harder  to 
understand. In section 5 (consistency of Z specifications), some recursive functions were
rewritten to make it easier to prove their existence. The new versions were harder to
understand.

[Gravell] and [Macdonald] present techniques for writing a specification so that it is
easier to understand. For example [Gravell] uses the phrase syntactic gap to mean the
difference between the English and the mathematics in a specification. The idea is to
minimise  the  syntactic  gap.  There  is  of  course  an  assumption  here  that  a  person’s  English 
is  intuitive  and  easy  to  understand. 

There  appears  to  be  a  conflict  between  specification  and  proof.  But  perhaps  this  conflict 
can  be  turned  to  advantage.  Why  not  have  both  the  clear  specification  for  understanding, 
and  the  unclear  specification  for  proof?  The  equivalence  of  these  two  specifications  must 
be proved: firstly to ensure that properties of the "proof" specification are indeed
properties of the "clear" specification; secondly to ensure that all properties of the "clear"
specification can be derived from the "proof" specification. In example 21, the
equivalence between the two specifications can be proved as follows. The induction
schema given in example 21 can be used to show that the "proof" specification implies
the "clear" specification. To show the converse, the lemma

⊢ ∀ x, y : ℕ₁ • (x divides y ∧ y divides x) ⇔ (x = y)

can be used to link the relation divides in the "clear" specification to equality in the
"proof" specification. In general, perhaps equivalence could be proved by a trusted
transformation  approach.  This  approach  would  be  similar  to  the  technique  of  program 
transformations. 


8  Conclusions 

This  paper  has  presented  a  number  of  useful  techniques  for  proving  theorems  in  Z.  Some 
of  the  techniques  give  the  user  an  advantage  when  proving  theorems.  For  example, 
generalization  can  make  a  theorem  easier  to  prove,  as  well  as  making  it  more  useful.  Also 
the  technique  for  choosing  the  right  induction  schema  and  induction  variable,  and  the 
technique  for  finishing  the  proof,  help  to  guide  the  user.  One  of  the  techniques  for 
generalizing  a  theorem  is  to  use  higher  order  functions.  The  identification  of  higher  order 
functions  and  theorems  involving  them  presents  an  opportunity  for  further  work  in  this 
area. 

The  paper  has  discussed  the  conflict  between  specification  and  proof.  Sometimes  a  less 
intuitive  specification  can  be  better  for  proof,  but  harder  to  understand.  This  conflict  can 
perhaps  be  turned  to  advantage  by  having  both  specifications.  One  specification  would  be 
for  understanding,  the  other  for  proof.  The  equivalence  of  these  two  specifications  must 
be proved: firstly to ensure that properties of the "proof" specification are indeed
properties of the "clear" specification; secondly to ensure that all properties of the "clear"
specification can be derived from the "proof" specification. Proving the equivalence
between these two specifications is an area for further work. Perhaps equivalence could
be  proved  by  a  trusted  transformation  approach.  This  approach  would  be  similar  to  the 
technique  of  program  transformations. 

The  paper  has  also  discussed  the  important  topic  of  consistency  of  Z  specifications.  A 
technique  is  presented  that  can  check  the  consistency  of  certain  non-primitive  recursive 
functions  defined  over  a  recursive  free  type.  This  technique  should  be  extended  to  cover  a 
wider  class  of  such  functions  and  free  types.  This  is  an  area  where  further  work  is  needed. 
The  laws  for  calculating  preconditions  are  also  useful  when  checking  consistency.  They 
can  be  used  to  check  that  a  specified  operation  is  actually  possible. 

The  paper  has  identified  some  useful  theorems  for  a  theorem  library.  Examples  of  such 
theorems are those generated by generalization. These theorems could be reused for
many different problems. Also, the laws for calculating preconditions would be a useful
addition  to  such  a  library. 

Some of the techniques could be automated. The resulting proof tool would be tailored to
the user, rather than the reverse which is currently the case. For example, if window
inference is automated this will give users the chance to do on a machine what they do on
paper. Using the techniques for generalization, a tool could automatically generalize a
theorem by replacing every occurrence of a subterm by a variable. If the tool was
successful in proving the generalized theorem it could automatically replace the variable
with the original subterm, thus proving the original theorem. The technique for choosing
the right induction schema and induction variable, and the technique for finishing the
proof, would mean a tool could automatically attempt to prove a theorem by induction.
The laws for calculating preconditions could be used to produce an automatic
precondition calculator.